Language Identification and Multilingual Speech Recognition Using Discriminatively Trained Acoustic Models
Abstract
We perform language identification experiments for four prominent South African languages using a multilingual speech recognition system. Specifically, we show how successfully Afrikaans, English, Xhosa and Zulu can be identified using a single set of HMMs and a single recognition pass. We further demonstrate the effect of discriminative acoustic model training targeted at language identification on both the per-language recognition accuracy and the accuracy of language identification itself. Experiments indicate that discriminative training leads to a small overall improvement in language identification accuracy without strongly affecting speech recognition performance. Furthermore, language identification is found to be more error-prone, and discriminative training less effective, for code-mixed utterances, indicating that these may require special treatment within a multilingual speech recognition system.
Related resources
Speech alignment and recognition experiments for Luxembourgish
Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe’s under-described languages. In this paper, we propose to study acoustic similarities between Luxembourgish and major contact languages (German, French, English) with the help of automatic speech alignment and recognition systems. Experiments were run using monolingual ac...
Improvements in Non-Verbal Cue Identification Using Multilingual Phone Strings
Today’s state-of-the-art front-ends for multilingual speech-to-speech translation systems apply monolingual speech recognizers trained for a single language and/or accent. The monolingual speech engine is usually adaptable to an unknown speaker over time using unsupervised training methods; however, if the speaker was seen during training, their specialized acoustic model will be applied, since ...
Different size multilingual phone inventories and context-dependent acoustic models for language identification
Experimental work using phonotactic and syllabotactic approaches for automatic language identification (LID) is presented. Various questions have originated this research: what is the best choice for a multilingual phone inventory? Can a syllabic unit be of interest to extend the scope of the modeling unit? Are context-dependent (CD) acoustic models, widely used for speech recognition, able to ...
Articulatory-Acoustic-Feature-based Automatic Language Identification
Automatic language identification is one of the important topics in multilingual speech technology. Ideal language identification systems should be able to classify the language of speech utterances within a specific time before further processing by language-dependent speech recognition systems or monolingual listeners begins. Currently the best language identification systems are based on HMM...
Improving Language Recognition with Multilingual Phone Recognition and Speaker Adaptation Transforms
We investigate a variety of methods for improving language recognition accuracy based on techniques in speech recognition, and in some cases borrowed from speaker recognition. First, we look at the question of language-dependent versus language-independent phone recognition for phonotactic (PRLM) language recognizers, and find that language-independent recognizers give superior performance in b...